SUMMARY

The posterior distribution over a countable set $\mathcal{M}$ of continuous data-sampling
distributions piles up at the L-projection of the true distribution $r$ on $\mathcal{M}$, provided that
the L-projection is unique. If there are several L-projections of $r$ on $\mathcal{M}$, then the
posterior probability splits among them equally.
1 Introduction

Walker [8] has recently considered consistency of the posterior distribution in Hellinger distance,
for a strictly positive prior over a countable set of continuous data-sampling distributions.
By means of his martingale approach [7], Walker developed a sufficient condition for the
Hellinger consistency of the posterior density in the above-mentioned setting. Via a simple
large-deviations approach we show that in this setting the posterior density is always consistent
in L-divergence. The consistency also holds under misspecification. If there are multiple
'concentration points' (L-projections), the posterior spreads among them equally.

2 Bayesian nonparametric consistency 

Let there be a countable set $\mathcal{M} = \{q_1, q_2, \dots\}$ of probability density functions with respect
to the Lebesgue measure; sources, for short. On this set a Bayesian puts his strictly positive
prior probability mass function $\pi(\cdot)$. Let $r$ be the true source of a random sample
$X^n = X_1, X_2, \dots, X_n$. Provided that $r \in \mathcal{M}$, as the sample size grows to infinity, the posterior
distribution $\pi(\cdot \mid X^n = x^n)$ over $\mathcal{M}$ is expected to concentrate in a neighborhood of the true
source $r$. Whether and under what conditions this indeed happens is the subject of Bayesian
nonparametric consistency investigations. Surveys of the subject can be found in [2], [5],
among others.
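
The posterior over a set of sources can be sketched numerically. The following is only an illustration under assumptions that are not part of this note: the countable set is truncated to four hypothetical unit-variance Gaussian sources, the prior is uniform, and the names `normal_logpdf` and `posterior` are made up for the example.

```python
import math
import random

def normal_logpdf(x, mu, sigma=1.0):
    """Log-density of N(mu, sigma^2) at x."""
    return -0.5 * math.log(2.0 * math.pi * sigma**2) - (x - mu)**2 / (2.0 * sigma**2)

def posterior(sample, means, prior):
    """Posterior pmf over the sources N(mu, 1), mu in `means`, given `sample`."""
    # log(prior * likelihood) for each source
    log_w = [math.log(p) + sum(normal_logpdf(x, mu) for x in sample)
             for mu, p in zip(means, prior)]
    top = max(log_w)  # subtract the maximum before exponentiating, for stability
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]

random.seed(0)
means = [-1.0, 0.0, 0.5, 2.0]      # hypothetical sources q_1, ..., q_4 (assumption)
prior = [0.25, 0.25, 0.25, 0.25]   # strictly positive prior pmf (assumption)
sample = [random.gauss(0.5, 1.0) for _ in range(500)]  # true source r = N(0.5, 1)
post = posterior(sample, means, prior)
```

In this well-specified toy case (the true source is the third element of the set), `post` puts nearly all of its mass on the third source.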

Ghosal, Ghosh and Ramamoorthi [5] define consistency of a sequence of posteriors with
respect to a metric or discrepancy measure $d$ as follows: The sequence $\{\pi(\cdot \mid X^n), n \ge 1\}$ is
said to be $d$-consistent at $r$ if there exists an $\Omega_0 \subset \mathbb{R}^{\infty}$ with $r(\Omega_0) = 1$ such that for $\omega \in \Omega_0$,
for every neighborhood $U$ of $r$, $\pi(U \mid X^n) \to 1$ as $n$ goes to infinity. If a posterior is
$d$-consistent for any $r \in \mathcal{M}$ then it is said to be $d$-consistent. There, two modes of convergence
are usually considered: convergence in probability and almost sure convergence.

Obviously, in the definition the set of sources is not restricted to be countable. The
present work is concerned with the case of countable $\mathcal{M}$.



3 Sanov's Theorem for Sources, L-consistency 

Let $\mathcal{M}^e = \{q : q \in \mathcal{M}, \pi(q) > 0\}$ be the support of the prior pmf. In what follows, $r$ is not
necessarily from $\mathcal{M}^e$. Thus we are also interested in Bayesian consistency under
misspecification; i.e., when $\pi(r) = 0$. The problem is the same as in the case of standard Bayesian
consistency (cf. Sect. 2): to find the source(s) upon which the posterior concentrates.

For two densities $p$, $q$ with respect to the Lebesgue measure¹ $\lambda$, the I-divergence is
$I(p \| q) = \int p \log(p/q)$. The L-divergence $L(q \| p)$ of $q$ with respect to $p$ is defined as
$L(q \| p) = -\int p \log q$. The L-projection $\hat{q}$ of $p$ on $\mathcal{Q}$ is $\hat{q} = \arg\inf_{q \in \mathcal{Q}} L(q \| p)$. There $\mathcal{Q}$ is a set of
probability densities defined on the same support. The value of the L-divergence at an L-projection
of $p$ on $\mathcal{Q}$ is denoted by $L(\mathcal{Q} \| p)$.
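
As a numerical illustration of these definitions, the sketch below approximates the L-divergence by the midpoint rule and collects all L-projections within a finite candidate set. Everything specific here is an assumption made for the example: Gaussian densities, the integration range and step count, the tolerance, and the helper names `l_divergence` and `l_projections`.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-(x - mu)**2 / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))

def l_divergence(q_mu, q_sigma, p_mu, p_sigma, lo=-12.0, hi=12.0, steps=20000):
    """L(q||p) = -integral of p(x) log q(x) dx, by the midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        log_q = (-0.5 * math.log(2.0 * math.pi * q_sigma**2)
                 - (x - q_mu)**2 / (2.0 * q_sigma**2))
        total += normal_pdf(x, p_mu, p_sigma) * log_q
    return -total * h

def l_projections(candidates, p_mu, p_sigma, tol=1e-6):
    """Candidates (mu, sigma) whose L-divergence is within tol of the minimum."""
    values = [l_divergence(mu, s, p_mu, p_sigma) for mu, s in candidates]
    best = min(values)
    return [c for c, v in zip(candidates, values) if v - best <= tol], best

# Hypothetical candidate set Q and density p = N(0.5, 1) (assumptions).
candidates = [(-1.0, 1.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
projs, value = l_projections(candidates, p_mu=0.5, p_sigma=1.0)
```

Here `p` lies halfway between two candidates, so the L-projection is not unique: `projs` contains both N(0, 1) and N(1, 1), and `value` plays the role of $L(\mathcal{Q} \| p)$.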

The following Sanov's Theorem for Sources (LST) will be needed for establishing the
consistency in L-divergence. The Theorem provides the rate of the exponential decay of the
posterior probability.

LST Let $\mathcal{N} \subset \mathcal{M}^e$. As $n \to \infty$,

$$\frac{1}{n} \log \pi(q \in \mathcal{N} \mid x^n) \to -\{L(\mathcal{N} \| r) - L(\mathcal{M}^e \| r)\},$$

with probability one.

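Proof Let $l_n(q) = \exp(\sum_{i=1}^{n} \log q(X_i))$, $l_n(A) = \sum_{q \in A} l_n(q)$, and $p_n(q) = \pi(q) l_n(q)$,
$p_n(A) = \sum_{q \in A} p_n(q)$. In this notation $\pi(q \in \mathcal{N} \mid x^n) = p_n(\mathcal{N}) / p_n(\mathcal{M}^e)$. The posterior probability is
bounded above and below as follows:

$$\frac{\hat{p}_n(\mathcal{N})}{\hat{l}_n(\mathcal{M}^e)} \le \pi(q \in \mathcal{N} \mid x^n) \le \frac{\hat{l}_n(\mathcal{N})}{\hat{p}_n(\mathcal{M}^e)},$$

where $\hat{l}_n(A) = \sup_{q \in A} l_n(q)$, $\hat{p}_n(A) = \sup_{q \in A} p_n(q)$.

$\frac{1}{n}(\log \hat{l}_n(\mathcal{N}) - \log \hat{p}_n(\mathcal{M}^e))$ converges with probability one to $L(\mathcal{M}^e \| r) - L(\mathcal{N} \| r)$. The
$\frac{1}{n} \log$ of the lower bound converges almost surely to the same limit. □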
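
The decay rate stated in the LST can be checked by simulation. The sketch below is a toy check under assumptions that are not taken from this note: three hypothetical unit-variance Gaussian sources, a misspecified true source $r = N(0.2, 1)$, an arbitrary prior, and the closed form $L(q \| r) = \frac{1}{2}\log(2\pi) + \frac{1}{2}(1 + (\mu_r - \mu_q)^2)$ that holds for unit-variance Gaussians. All function names are made up for the example.

```python
import math
import random

def log_lik(sample, mu):
    """Log-likelihood of the sample under the source N(mu, 1)."""
    return sum(-0.5 * math.log(2.0 * math.pi) - 0.5 * (x - mu)**2 for x in sample)

def log_sum_exp(values):
    """log(sum(exp(v))) computed without overflow or underflow."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def log_posterior_of_subset(sample, means, prior, subset):
    """log pi(q in N | x^n), where N is given by the indices in `subset`."""
    log_w = [math.log(p) + log_lik(sample, mu) for mu, p in zip(means, prior)]
    return log_sum_exp([log_w[i] for i in subset]) - log_sum_exp(log_w)

def l_div_unit_normal(mu, r_mu):
    """Closed form of L(q||r) for q = N(mu, 1) and r = N(r_mu, 1)."""
    return 0.5 * math.log(2.0 * math.pi) + 0.5 * (1.0 + (r_mu - mu)**2)

random.seed(1)
means, prior = [0.0, 1.0, 3.0], [0.5, 0.3, 0.2]   # hypothetical M^e and prior
r_mu, n = 0.2, 5000                                # r = N(0.2, 1) is not in M^e
sample = [random.gauss(r_mu, 1.0) for _ in range(n)]
subset = [2]                                       # N = {N(3, 1)}

empirical_rate = log_posterior_of_subset(sample, means, prior, subset) / n
limit = -(min(l_div_unit_normal(means[i], r_mu) for i in subset)
          - min(l_div_unit_normal(mu, r_mu) for mu in means))
```

For this configuration `limit` equals $-3.9$, and `empirical_rate` should lie close to it for large `n`; the remaining gap reflects sampling fluctuation and the vanishing contribution of the prior.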

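For $\epsilon > 0$, let $\mathcal{N}_\epsilon^C(\mathcal{M}^e) = \{q : L(q \| r) - L(\mathcal{M}^e \| r) > \epsilon, q \in \mathcal{M}^e\}$. Let
$\mathcal{N}_\epsilon(\mathcal{M}^e) = \mathcal{M}^e \setminus \mathcal{N}_\epsilon^C$.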

Corollary Let there be a finite number of L-projections of $r$ on $\mathcal{M}^e$. As $n \to \infty$,
$\pi(q \in \mathcal{N}_\epsilon^C(\mathcal{M}^e) \mid x^n) \to 0$, with probability one.

Standard Bayesian consistency follows as the special case $\pi(r) > 0$ of the Corollary.



¹ Any $\sigma$-finite measure, in general.
4 Posterior Equi-concentration of Sources



If there is more than one L-projection of $r$ on $\mathcal{M}^e$, how is the posterior probability
asymptotically spread among them? This issue is answered, 'in probability', by the next Theorem.
Let $\mathcal{N}_\epsilon^1 \subset \mathcal{N}_\epsilon(\mathcal{M}^e)$ contain (among other sources) just one L-projection of $r$ on $\mathcal{M}^e$.

Theorem Let there be $k$ L-projections of $r$ on $\mathcal{M}^e$. Then for $n$ going to infinity,
$\pi(q \in \mathcal{N}_\epsilon^1 \mid x^n) \to 1/k$, in probability.

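Proof For any $\epsilon > 0$, there exists an $n_0$ such that for $n > n_0$, $r\{x^n : S(\hat{q}) = S(\hat{q}_L)\} = 1$,
where $\hat{q} = \arg\sup_{q \in \mathcal{M}^e} \pi(q \mid x^n)$, $\hat{q}_L$ is an L-projection of $r$ on $\mathcal{M}^e$, and $S(\cdot)$ stands for 'set of
all'. Consequently, $\pi(\hat{q}_L \mid x^n) \ge \pi(q \mid x^n)$ for all $q \in \mathcal{M}^e$. The posterior $\pi(q \in \mathcal{N}_\epsilon^1 \mid x^n)$ can be
expressed as $(1 - A)/(k(1 - B))$, where $A = \sum_{\sigma_1} \pi(q \mid x^n)/\pi(\hat{q}_L \mid x^n)$, $B = \sum_{\sigma_2} \pi(q \mid x^n)/\pi(\hat{q}_L \mid x^n)$,
$\sigma_1 = \mathcal{N}_\epsilon^1 \setminus \hat{q}_L$, $\sigma_2 = \mathcal{M}^e \setminus \bigcup_{j=1}^{k} \hat{q}_L^{j}$. Markov's inequality implies that $\pi(q \mid x^n)/\pi(\hat{q}_L \mid x^n)$
converges to zero, in probability. Slutsky's Theorem then implies that $A$ and $B$ converge to zero,
in probability. □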

5 EndNotes 

In order to place this note in context let us make a few comments. 

1) An inverse of Sanov's Theorem has been established by Ganesh and O'Connell [1]
for the case of sources with finite alphabet, by means of a formal large-deviations approach.
Unaware of their work, the present author developed in [3] an inverse of Sanov's Theorem
for n-sources, for both discrete and continuous alphabets, and applied it to the problem of
conditioning by rare sources and to the criterion choice problem; cf. also [4].

2) The concepts of L-divergence and L-projection were introduced in [3]. See [3] for a
short discussion on why or why not the 'new' divergence.

3) The present form of Sanov's Theorem for Sources (LST) as well as its proof are new. 

4) Bayesian consistency under misspecification has already been studied by Kleijn and
van der Vaart [6] for the general setting of a continuous prior on a set of continuous sources, using
a different technique. The authors developed sufficient conditions for a somewhat related
consistency (cf. Corollary 2.1 and Lemma 6.4 of [6]) as well as rates of convergence.
Equi-concentration was not considered there.
With a typo in the statement and proof of LST ($\mathcal{M}^e$ and $\mathcal{N}$ were interchanged) this note
appeared as: M. Grendar, L-divergence consistency for a discrete prior, J. Stat. Res., 40(1),
73-76, 2006.